Extracting Topics and Innovators Using Topic Diffusion Process in Weblogs

Authors

  • Tadanobu Furukawa
  • Yutaka Matsuo
  • Ikki Ohmukai
  • Koki Uchiyama
  • Mitsuru Ishizuka
Abstract

The diffusion process on weblogs has attracted great interest since the early days of weblog studies. We propose a ranking technique that extracts topics and innovators by analyzing that process. Our method identifies the URLs of topics and the bloggers who trigger topic diffusion. We assume that the strength with which a topic propagates is determined by the influence of the topic and the influence of the bloggers. This decomposition is obtained through singular value decomposition (SVD): we construct a diffusion matrix, and its first left and right singular vectors are regarded, respectively, as the influences of the topics and of the bloggers. We show that our method can extract propagative topics (which are not bursty because of the media effect) as well as influential bloggers.
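The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes a small, hypothetical diffusion matrix whose rows are topics (URLs) and whose columns are bloggers; the abstract does not say how the entries are computed, so the values here are placeholders, and the function name `rank_by_first_singular_vectors` is invented for illustration. The sketch only shows how the first left and right singular vectors yield one score per topic and one per blogger; it is not the authors' implementation.

```python
import numpy as np


def rank_by_first_singular_vectors(diffusion):
    """Return (topic_scores, blogger_scores) from the leading singular vectors.

    Assumption: rows of `diffusion` are topics and columns are bloggers.
    """
    matrix = np.asarray(diffusion, dtype=float)
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    topic_scores = u[:, 0]      # first left singular vector
    blogger_scores = vt[0, :]   # first right singular vector
    # Singular vectors are defined only up to a joint sign flip,
    # so orient them so that the topic scores sum to a non-negative value.
    if topic_scores.sum() < 0:
        topic_scores, blogger_scores = -topic_scores, -blogger_scores
    return topic_scores, blogger_scores


# Hypothetical toy data: 3 topics (rows) x 3 bloggers (columns).
diffusion = np.array([
    [5.0, 1.0, 0.0],
    [3.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

topic_scores, blogger_scores = rank_by_first_singular_vectors(diffusion)

# Indices sorted from most to least influential.
topic_ranking = np.argsort(-topic_scores)
blogger_ranking = np.argsort(-blogger_scores)
```

Only the first singular pair is used here because the abstract treats the first left and right singular vectors as the influence scores; the remaining singular vectors are ignored in this sketch.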


Similar resources

A review of text mining approaches and their function in discovering and extracting a topic

Background and aim: Four text mining methods are examined and focused on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling.  Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...


Extracting Topics From Weblogs Through Frequency Segments

In this paper, we present an approach to extracting topics from weblogs by using terms that appear in them. We model a term in terms of frequency segments, i.e., sequential occurrences of the term over time, as the unit of characterization. A notable feature of the model is its approximation of changes in the dynamics of term frequencies; it captures the granularity of frequencies from the very...


A content analysis of the posts and comments of individual and collective Persian weblogs on library and information science

The present study used content analysis to examine the posts and comments in 85 individual and 31 collective weblogs published in Farsi on the subject of library and information science. Studies showed that collective weblogs average more monthly posts than individual weblogs, while the reverse is true for comments. The highest numbers of postings i...


A probabilistic topic model based on local word relationships in overlapping windows

A probabilistic topic model assumes that documents are generated through a process involving topics and then tries to reverse this process, given the documents and extract topics. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...


Extracting Domain-Dependent Semantic Orientations of Latent Variables for Sentiment Classification

Sentiment analysis of weblogs is a challenging problem. Most previous work utilized semantic orientations of words or phrases to classify sentiments of weblogs. The problem with this approach is that semantic orientations of words or phrases are investigated without considering the domain of weblogs. Weblogs contain the author’s various opinions about multifaceted topics. Therefore, we have to ...




Publication date: 2008